Location Analysis of NYC Michelin Star Restaurants
DSAN 6750 / PPOL 6805: GIS for Spatial Data Science
Author
Morgan Dreiss
Introduction
In 1889, a French tire company, Michelin, started to produce a book of helpful road-side information for Europe. Soon after, they started to include recommended restaurants. Then in 1926, they started awarding stars to these restaurants. By 1931, the three-star system was born.(Michelin Guide 2024) The Michelin Guide now rates over 30,000 establishments in over 30 territories across 3 continents. (“About the MICHELINGuide” n.d.)
Page from Guide Michelin France 1911
Michelin Stars are awarded annually, and they are looking for new Star awardees and revisit existing Star awarded restaurants to ensure they are maintaining their high standards.
Michelin Restaurant Stars are described by the following:
One MICHELIN Star: restaurants use top quality ingredients, where dishes with distinct flavors are prepared to a consistently high standard.(“What Is a MichelinStar?” n.d.)
Two MICHELIN Stars: the personality and talent of the chef are evident in their expertly crafted dishes; their food is refined and inspired.(“What Is a MichelinStar?” n.d.)
Three MICHELIN Stars: given for the superlative cooking of chefs at the peak of their profession; their cooking is elevated to an art form and some of their dishes are destined to become classics.(“What Is a MichelinStar?” n.d.)
Michelin Star Restaurants are decided by undercover Michelin Inspectors, who will visit the restaurant multiple times to determine if the quality is worthy of a star. A restaurant must apply for a Michelin Star or someone can nominate them to be considered. According the Michelin Guide, Inspectors rate the restaurants on the following criteria: (“What Is a MichelinStar?” n.d.)
Quality of products,
Mastery of flavor and cooking techniques,
Harmony of flavors,
The personality of the chef represented in the dining experience, and
Consistency between inspectors’ visits
The Michelin Guide CEO stated that “The star is only about the quality of the food; it’s not about the service and the setting.”(Robertson 2023) This statement and the criteria would have you believe that a restaurant does not have to overly fancy or expensive to be considered for a Michelin Star.
However, many have wondered if there are additional, hidden criteria for Michelin Star Restaurant selection or what factors affect a restaurant’s chances of being selected.
Literature Review
Most of the literature on the selection of Michelin Star Restaurants comes from the Michelin Guide itself, but given that obtaining a Michelin Star can greatly boost the prestige of a restaurant, it is no surprise that people have studied the anatomy of a Michelin Star Restaurant. One study that was conducted on multiple Michelin Star Restaurants in Europe found that the key factors that lead to their success was “investment and investment types, sources of financing, pursuit of excellence, and [the type of] culinary craftsmanship involved.”(Johnson et al. 2005) There are also a few blog posts and culinary articles that try and parse out what makes a Michelin Star Restaurant(Robertson 2023), however most of the literature is more interested in the effects on those establishments that obtain Michelin stars.
Some have dubbed it the “Michelin Effect,” (Bang, Choi, and Kim 2022) but many have been interested in the tangible benefits of obtaining a Star. One study found that Michelin Stars primarily “enhance social, hedonic and service quality values.”(Bang, Choi, and Kim 2022) while others focused on the “Michelin-induced price increase of approximately 30% per Michelin star.”(Gergaud, Storchmann, and Verardi 2015) However, not everything is sunshine-and-daisies for restaurants that finally obtain a Michelin Star. Another study found that for restaurants in New York City, receiving a Michelin Star corresponded to an increased likelihood of restaurant closure.(Sands 2024)
Whether or not a Michelin Star is worth while for a restaurant is not the aim of this project. This project is primarily concerned with if there is a correlation of restaurant location and likelihood of obtaining a star. I was unable to find any work that was done on this topic. Which brings us to the data question:
Data Question
Does the proximity to a previous Michelin Star Restaurant increase a restaurants’ chances of becoming one themselves?
Methodology
This project is going to focus on just one of the Michelin evaluated areas: New York City. New York City was the first Michelin Guide for North America and started in 2005, so there are quite a few years of Michelin Restaurants to work with. (“List of Michelin-Starred Restaurants in NewYorkCity” 2024)
Given that many Michelin Star Restaurants retain their stars for at least a year or two, I decided to put some temporal space between the years I would look at. I chose the years 2010 and 2020 so that there would be little overlap in Restaurants between the two sets. (There are 19 restaurants that appear on both the 2010 and 2020 list.)
The actual location analysis was conducted using point processes and intensity functions to generate simulated points (restaurants).
Warning: package 'stars' was built under R version 4.4.2
set.seed(6805)raster_eps <-150
Exploratory Data Analysis (EDA)
First, we load in the data sets for the Michelin Star Restaurants in New York City. For this project was are going to focus on Manhattan as it contains the majority of the Michelin Star Restaurants opposed to the boroughs. The past Michelin Guide Restaurant list was obtained from Wikipedia.(“List of Michelin-Starred Restaurants in NewYorkCity” 2024) The locations of the restaurants were generated by the Yelp API prior to importing the csv into this project. The details of this process can be requested.
# read in the sepereated year michelin filesmichelin_df_2010 <-read.csv('data/michelin_df_2010.csv')michelin_df_2020 <-read.csv('data/michelin_df_2020.csv')# read in the combined michelin filesmichelin_df <-read.csv('data/michelin_df_2010_2020.csv')michelin_df$Label <-as.factor(michelin_df$Label)# display the first few rows of the combined dfprint(head(michelin_df))
X Name Borough Address Latitude
1 1 A Voce Columbus Manhattan 88 Madison Ave New York NY 10016 40.74451
2 2 Adour Manhattan 715 Broadway Bayonne NJ 07002 40.67160
3 3 Ai Fiori Manhattan 400 5th Ave New York NY 10018 40.75008
4 4 Alto Manhattan 690 Bloomfield Ave Montclair NJ 07042 40.81699
5 5 Annisa Manhattan 450 E 29th St New York NY 10016 40.73927
6 6 Anthos Manhattan Brooklyn NY 11249 40.71859
Longitude Label
1 -73.98560 2010
2 -74.11198 2010
3 -73.98376 2020
4 -74.22201 2010
5 -73.97346 2010
6 -73.95946 2010
Once the data is loaded in, I turned them into sf objects to observe their locations within NYC.
# Create the sf object for both years of Michelin star restaurantsmichelin_sf <- michelin_df |> sf::st_as_sf(coords=c("Longitude", "Latitude"),crs=4326) |> sf::st_transform(3857)# create the sf object for 2010 restaurantsmichelin_sf_2010 <- michelin_df_2010 |> sf::st_as_sf(coords=c("Longitude", "Latitude"),crs=4326) |> sf::st_transform(3857)# create the sf object for the 2020 restaurantsmichelin_sf_2020 <- michelin_df_2020 |> sf::st_as_sf(coords=c("Longitude", "Latitude"),crs=4326) |> sf::st_transform(3857)# view the combined with the label to see bothmapview(michelin_sf, zcol ="Label", label ="Name", col.region = c_palette)
In order to conduct location analysis, I needed a population of restaurants to see where the majority of restaurants are located within the city. For this I used Outscraper, which queries all the restaurants for the Manhattan zip codes in Google Maps. The output file included the name and location of all the restaurants. Of note, this population of restaurants would be of restaurants that are open in 2024.(Yunus 2022)
# read in the dataframe of all restaurants in Manhattanrest_pop_df <-read.csv('data/outscraper_manhattan_restaurants.csv')# select the important columns rest_pop_df <- rest_pop_df |>select(c(name, street, latitude, longitude))# display the first few lineshead(rest_pop_df)
name
street
latitude
longitude
Friedman’s Herald Square
138 W 31st St
40.74851
-73.99089
Shukette
230 9th Ave
40.74715
-74.00052
Ci Siamo
440 W 33rd St Suite #100
40.75262
-73.99891
Alligator Pear
150 W 30th St
40.74823
-73.99211
L’Amico
849 6th Ave
40.74693
-73.99009
Pennsylvania 6
132 W 31st St
40.74833
-73.99083
I also wanted to view the restaurant population for NYC visually.
# create the sf object for the restaurant "population"rest_pop_sf <- rest_pop_df |> sf::st_as_sf(coords=c("longitude", "latitude"),crs=4326) |> sf::st_transform(3857)# view the population of restaurantsmapview(rest_pop_sf, col.region = c_palette)
It appears some of the locations were off for the queries restaurants and they fell outside of the zip codes that were queried. In addition, some of the 2020 Michelin Restaurants were outside of Manhattan. To resolve this, I am going to overlay the polygon of Manhattan form teh nycgeo library.
library(nycgeo)# load the borough boundaries as an sf objectboroughs_sf <- nycgeo::borough_sf# view the first few rows of the datasetprint(head(boroughs_sf))
old-style crs object detected; please recreate object with a recent sf::st_crs()
old-style crs object detected; please recreate object with a recent sf::st_crs()
Simple feature collection with 5 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 913174.7 ymin: 120124.9 xmax: 1067382 ymax: 272847.3
Projected CRS: NAD83 / New York Long Island (ftUS)
# A tibble: 5 × 7
geoid state_fips county_fips county_name borough_name borough_id
<chr> <chr> <chr> <chr> <chr> <chr>
1 36061 36 061 New York Manhattan 1
2 36005 36 005 Bronx Bronx 2
3 36047 36 047 Kings Brooklyn 3
4 36081 36 081 Queens Queens 4
5 36085 36 085 Richmond Staten Island 5
# ℹ 1 more variable: geometry <MULTIPOLYGON [US_survey_foot]>
# filter for just Manhattan since that is what our all restaurant data ismanhattan_sf <- boroughs_sf[boroughs_sf$borough_name =="Manhattan",] |> sf::st_transform(3857)
old-style crs object detected; please recreate object with a recent sf::st_crs()
old-style crs object detected; please recreate object with a recent sf::st_crs()
# and viewmapview(manhattan_sf, col.region ='#3C61A5')
Once I had the Manhattan polygon, I intersected all the restaurants sf objects with it to obtain just the Manhattan restaurants. The maps of all the restaurant sets are are below.
Manhattan Michelin Star Restaurants (2010 and 2024):
# create a buffer around Manhattan to catch an edge casesmanhattan_sf <- manhattan_sf |>st_buffer(10) # check the coordinate reference systemst_crs(manhattan_sf)
Coordinate Reference System:
User input: EPSG:3857
wkt:
PROJCRS["WGS 84 / Pseudo-Mercator",
BASEGEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
MEMBER["World Geodetic System 1984 (G2296)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]],
CONVERSION["Popular Visualisation Pseudo-Mercator",
METHOD["Popular Visualisation Pseudo Mercator",
ID["EPSG",1024]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["False easting",0,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["easting (X)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["northing (Y)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Web mapping and visualisation."],
AREA["World between 85.06°S and 85.06°N."],
BBOX[-85.06,-180,85.06,180]],
ID["EPSG",3857]]
# intersect Manhattan with all out sf point objects to get just those contained in Manhattanmichelin_man_sf <-st_intersection(manhattan_sf, michelin_sf)
Warning: attribute variables are assumed to be spatially constant throughout
all geometries
Warning: attribute variables are assumed to be spatially constant throughout
all geometries
# view the michelin restaurants to confirm they are only in the correct areamichelin_man_sf |>mapview(zcol ="Label", label ="Name", col.region=c_palette)
Manhattan Restaurant Population:
rest_pop_man_sf |>mapview(col.region = c_palette)
Conduct Analysis
Now that I have been visually able to observe the data, I want to see the actual density of the restaurants through heat maps. Creating intensity functions are also the first step to conducting analysis since we will need the intensity function to generate simulated restaurants in the future.
First, I created the window for our point processes. In this case, it is all of Manhattan.
# create the window for our ppp objects from the Manhattan sfmanhattan_sfc <- manhattan_sf |> sf::st_convex_hull()class(manhattan_sfc)
[1] "sf" "tbl_df" "tbl" "data.frame"
I then converted all our sf objects to point pattern objects, or ppp objects.
PPP object for Manhattan Michelin Star Restaurants (2010 and 2024):
# create the ppp object from the michelin sf objects and the Manhattan windowmichelin_ppp <-as.ppp(sf::st_as_sfc(michelin_man_sf), W=as.owin(manhattan_sfc))
Warning: data contain duplicated points
# create the ppp object from the michelin sf objects and the Manhattan windowmichelin_ppp_2010 <-as.ppp(sf::st_as_sfc(michelin_man_sf_2010), W=as.owin(manhattan_sfc))
# create the ppp object from the michelin sf objects and the Manhattan windowmichelin_ppp_2020 <-as.ppp(sf::st_as_sfc(michelin_man_sf_2020), W=as.owin(manhattan_sfc))
# create the ppp object from our restaurant population and the Manhattan windowrest_pop_ppp <-as.ppp(sf::st_as_sfc(rest_pop_man_sf), W=as.owin(manhattan_sfc))
From the point pattern objects, I was able to construct the intensity functions. The intensity functions can then be mapped to a heat map to observe their concentrations by location.
Intensity Functions for Manhattan Michelin Star Restaurants (2010 and 2024):
# use the density function on the ppp object to get the intensity function of 2010mich_int_2010 <-density(michelin_ppp_2010,eps=raster_eps)# plot the intensity functionplot(mich_int_2010, main ='Michelin Star Restaurants Intensity - 2010')contour(mich_int_2010, add=TRUE)
# use the density function on the ppp object to get the intensity function of 2020mich_int_2020 <-density(michelin_ppp_2020,eps=raster_eps)# plot the intensity functionplot(mich_int_2020, main ='Michelin Star Restaurants Intensity - 2020')contour(mich_int_2020, add=TRUE)
Between the 2010 and 2020 Michelin Star Restaurants, we can see that there is a shift in the center of the heat map, in addition to a significant change in the contours of the map. The 2010 restaurants are more spread out with multiple contour rings. The 2020 restaurants’ center is shifted south and they are much more concentrated around the center.
Intensity Functions for Total Restaurant Population:
# use the density function to rest_pop_int <-density(rest_pop_ppp,eps=raster_eps)# plot the intensity functionplot(rest_pop_int, main ="Total Restaurant Population Intensity")contour(rest_pop_int, add=TRUE)
The total restaurant population intensity function is similar to the 2010 Michelin restaurants, with a slightly more northern center and multiple contour rings. However, the scale of the intensity is obviously much higher since this is for all restaurants in NYC, not a small subset as with the Michelin Restaurants.
I then validated the intensity functions by overlaying the points onto the functions.
Michelin Star Restaurants (2010):
michelin_2010_stars <- mich_int_2010 |> stars::st_as_stars()michelin_2010_points_sf <- michelin_ppp_2010 |> sf::st_as_sf() |>filter(label =="point")michelin_2010_points_sf |>ggplot() +geom_stars(data=michelin_2010_stars) +geom_sf() +scale_fill_viridis_c(option="C", na.value ="transparent") +labs(title ="Intensity & Point Overlay of 2010 Michelin Star Restaurants")+theme_classic()
Michelin Star Restaurants (2020):
michelin_2020_stars <- mich_int_2020 |> stars::st_as_stars()michelin_2020_points_sf <- michelin_ppp_2020 |> sf::st_as_sf() |>filter(label =="point")michelin_2020_points_sf |>ggplot() +geom_stars(data=michelin_2020_stars) +geom_sf() +scale_fill_viridis_c(option="C", na.value ="transparent") +labs(title ="Intensity & Point Overlay of 2020 Michelin Star Restaurants") +theme_classic()
Total Restaurant Population:
library(stars)rest_pop_stars <- rest_pop_int |> stars::st_as_stars()rest_pop_points_sf <- rest_pop_ppp |> sf::st_as_sf() |>filter(label =="point")rest_pop_points_sf |>ggplot() + stars::geom_stars(data=rest_pop_stars) +geom_sf() +scale_fill_viridis_c(option="C", na.value="transparent")+labs(title ="Intensity & Point Overlay of All Restaurants")+theme_classic()
Hypothesis Testing
To actually answer the initial question of if the proximity to a previous Michelin Star Restaurant increase a restaurants’ chances of becoming one themselves, I am going to conduct a hypothesis test. To do this, I will simulate multiple possible sets of Michelin Star Restaurants around Manhattan based on the total restaurant population and apply at test statistic to see how these simulated restaurant sets differ from the actual set of 2020 Michelin Star restaurants.
Test Statistic
The average distance between restaurants (points) will be our test statistic for the future hypothesis tests.
I created a function that measures a set of restaurant’s average distance to the set of 2010 Michelin Restaurants.
compute_dist <-function(set_b_sf){# Calculate pairwise distances between the points# first sf is going to be the 2010 Michelin restaurants of Manhattan distances <-st_distance(michelin_man_sf_2010, set_b_sf)# Convert the distance matrix to a vector and remove the diagonal (distance to itself) dist_vector <-as.vector(distances) dist_vector <- dist_vector[dist_vector !=0]# Calculate and return the average distance avg_distance <-mean(dist_vector)return(avg_distance)}
I then computed observed distance between the 2010 Michelin Star Restaurants and the 2020 Michelin Star Restaurants. The observed average distance was 3623 meters.
The null and alternative hypotheses, based on the EDA:
\(H_0\): The average distance between Michelin Star Restaurants is the same as between most restaurants. \(H_A\): The average distance between Michelin Star Restaurants is the less than between most restaurants.
Monte Carlo Simulation
As said above, I will do this be simulating sets of restaurants. I created a function to create an sf object of simulated restaurants, with the total restaurant population intensity function influencing the likelihood of where the points will be placed. The function creates the same number of points of 2020 Michelin Star Restaurants in Manhattan, 62.
set.seed(6805)# generate a number of points equal to the number of rows# in michelin_man_sf_2020, but with intensity function given by rest_pop_intmichelin_star_sim <-function() { michelin_sim <- spatstat.random::rpoint(n =nrow(michelin_man_sf_2020),f = rest_pop_int )# a ppp object is return from the spatstat.random function# turn it back into a sf object michelin_sim <- michelin_sim |> sf::st_as_sf()# set the crs to the rest of our data 3857st_crs(michelin_sim) <-3857return(michelin_sim)}
I ran a single simulation and observed the returned sf object.
# call the function once, and plot itsingle_michelin_sim_sf <-michelin_star_sim()single_michelin_sim_sf |>ggplot() +geom_sf() +labs(title ="Simulated 2020 Michelin Star Restaurants")+theme_classic()
This simulation of restaurants seems reasonable, but I also wanted to check the test statistic for this single simulation.
# generate the distance to all the new simulated Michelin star restaurantssingle_michelin_sim_distance <-compute_dist(single_michelin_sim_sf)single_michelin_sim_distance
[1] 5213.384
This average distance is much higher than our observed value, but I will run 999 simulations to continue with the analysis.
Looping through the following code 999 times, gave us a list of all the test statistic for each simulation.
set.seed(6805)# initialized empty list to store test statistics for each simulationall_sims_dist <-list()# number of desired simulationsnum_sims =999# run through all the simulationsfor (n in1:num_sims){# call the michelin_star_sim function that returns an sf object new_sim_sf <-michelin_star_sim()# compute the distance to the points of the new sf object new_dist <-compute_dist(new_sim_sf)# add the test statistic in the list all_sims_dist <-append(all_sims_dist, new_dist)}# check that all the distances were computedprint(length(all_sims_dist))
I converted the list into a dataframe to properly observe the distributions of average distances.
# turn our list of simulated distances into a dfsim_dist_df <-as.data.frame(do.call(rbind, all_sims_dist))sims_dist_mean <-mean(sim_dist_df$V1)# plot the density of distances plus the observed average distancesim_dist_plot <- sim_dist_df |>ggplot(aes(x=V1))+geom_density(fill=c_palette[2], alpha=0.5) +geom_vline(xintercept = obs_dist, color=c_palette[1]) +geom_vline(xintercept = sims_dist_mean, color=c_palette[3])+labs(title ="Density of Simulated Michelin Restaurant Average Distances",x ="Average Distance Between Restaurants (m)",y ="Density" )+theme_classic()sim_dist_plot
As you can see, the observed average distance is much lower than all the simulated test statistics.
When looking between the mean simulated test statistic and the observed test statistic, there is quite a large gap. And when looking at the p-values, we see that there are no simulations that are as low as the observed average distance.
At the 95% confidence interval, we reject the null hypothesis.
Hypothesis Test with Different Restaurant Population
The large gap between the simulated and observed test statistic made me question if the total restaurants in Manhattan was the proper population to create the simulated restaurant sets.
Technically there is no requirements for Michelin Star Restaurants to be expensive. However, most of the Michelin Star Restaurants do fall on the pricier end of the spectrum. With that in mind, I decided to run an additional set of simulations, but using expensive Manhattan restaurants as the restaurant population when simulating sets of restaurants.
Data Prep 2.0
I ran through most of the initial steps defined above prior to the hypothesis test, but for a new set of restaurants.
The total restaurant data set had the price range defined for each restaurant. I began by filtering the data frame to only include restaurants that were defined by three or four dollar signs (according to Google Maps). I then ran through all the same steps as above, converting to sf, filtering for just Manhattan, converting to ppp object, and generating the intensity function.
Filter dataframe:
# read in the dataframe of all restaurants in Manhattanrest_pop_df <-read.csv('data/outscraper_manhattan_restaurants.csv')# select the important columns# keeping range this time to filter for expensive restaurantsrest_pop_df <- rest_pop_df |>select(c(name, street, latitude, longitude, range))# convert the price range category into a factorrest_pop_df$range <-as.factor(rest_pop_df$range)# create a new df that only has the most expensive restaurantsexpensive_df <- rest_pop_df |>filter(range =="$$$"| range =="$$$$")# display the first few lineshead(expensive_df)
\[$ |
|miss KOREA BBQ |10 W 32nd St | 40.74731| -73.98648|\]$
Oscar Wilde
45 W 27th St
40.74516
-73.99007
$$$
Convert to sf object and filter for Manhattan:
# create the sf object for the expensive restaurant "population"expensive_sf <- expensive_df |> sf::st_as_sf(coords=c("longitude", "latitude"),crs=4326) |> sf::st_transform(3857)# intersect with Manhattan areaexpensive_man_sf <-st_intersection(manhattan_sf, expensive_sf)
Warning: attribute variables are assumed to be spatially constant throughout
all geometries
# view the population of restaurantsmapview(expensive_man_sf, col.region = c_palette)
Create ppp object:
# create the ppp object from our restaurant population and the Manhattan windowexpensive_ppp <-as.ppp(sf::st_as_sfc(expensive_man_sf), W=as.owin(manhattan_sfc))
# use the density function to expensive_int <-density(expensive_ppp,eps=raster_eps)# plot the intensity functionplot(expensive_int, main ="Expensive Restaurant Population Intensity")contour(expensive_int, add=TRUE)
Redefine the Simulation Function with the new Restaurant Population:
set.seed(6805)# generate a number of points equal to the number of rows# in michelin_man_sf_2020, but with intensity function given by rest_pop_intmichelin_star_sim2 <-function() { michelin_sim <- spatstat.random::rpoint(n =nrow(michelin_man_sf_2020),f = expensive_int )# a ppp object is return from the spatstat.random function# turn it back into a sf object michelin_sim <- michelin_sim |> sf::st_as_sf()# set the crs to the rest of our data 3857st_crs(michelin_sim) <-3857return(michelin_sim)}
Single Simulation:
# call the function once, and plot itsingle_michelin_sim_expensive_sf <-michelin_star_sim2()single_michelin_sim_expensive_sf |>ggplot() +geom_sf() +labs(title ="Simulated 2020 Michelin Star Restaurants (Expensive)")+theme_classic()
Just from doing one simulation, this could be more promising and more indicative of Michelin restaurant populations.
Monte Carlo Simulations with Expensive Restaurant Population
Looping through the following code 999 times, gave us a list of all the test statistic for each simulation.
set.seed(6805)# initialized empty list to store test statistics for each simulationall_sims_dist_2 <-list()# number of desired simulationsnum_sims =999# run through all the simulationsfor (n in1:num_sims){# call the michelin_star_sim function that returns an sf object new_sim_sf <-michelin_star_sim2()# compute the distance to the points of the new sf object new_dist <-compute_dist(new_sim_sf)# add the test statistic in the list all_sims_dist_2 <-append(all_sims_dist_2, new_dist)}# check that all the distances were computedprint(length(all_sims_dist_2))
I converted the list into a dataframe to properly observe the distributions of average distances.
# turn our list of simulated distances into a dfsim_dist_expen_df <-as.data.frame(do.call(rbind, all_sims_dist_2))sims_dist_expen_mean <-mean(sim_dist_expen_df$V1)# plot the density of distances plus the observed average distancesim_dist_plot2 <- sim_dist_expen_df |>ggplot(aes(x=V1))+geom_density(fill=c_palette[2], alpha=0.5) +geom_vline(xintercept = obs_dist, color=c_palette[1]) +geom_vline(xintercept = sims_dist_expen_mean, color=c_palette[3])+labs(title ="Density of Simulated Michelin Restaurant Average Distances (Expensive)",x ="Average Distance Between Restaurants (m)",y ="Density" )+theme_classic()sim_dist_plot2
As you can see, the observed average distance is still much lower than all the simulated test statistics, however the gap between the two isn’t as pronounced as our last test.
Looking at the values of the simulated and observed test statistic:
Again no simulated average distances come close to the observed value and there is still almost a 1000 meter difference between the observed and simulated average distances. So we once again, at the 95% confidence interval, we reject the null hypothesis.
Given these two tests, being in closer proximity to a current (or previous) Michelin Star Restaurant increases your chances to become a Michelin Star Restaurant in New York City.
Conclusion and Discussion
The results of these tests are really only indicative of the New York City Michelin Guide. It is likely that all the different regions that have Michelin Guides have different cultures and trends among themselves. These results make you question if there are additional, hidden criteria that result from inspector bias (or possibly overt, hidden criteria given from Michelin).
There also could be underlying criteria for what restaurants aspire to become Michelin Starred, and possibly that is what this project has captured. Either way, there are likely additional features of the restaurants that cause them to be closer in proximity to each other. However, it remains possible that simply by being located closer to others makes you more likely to become one.
Future work could be to try and uncover what those “hidden” features are, or the analysis could be conducted on other areas where Michelin has Guides.
Overall, if you are interested in your restaurant obtaining a Michelin Star in New York City, it would be safer for you to find a location that is in close proximity to other Michelin Star Restaurants.
Bang, Dohyung, Kyuwan Choi, and Alex Jiyoung Kim. 2022. “Does Michelin Effect Exist? An Empirical Study on the Effects of Michelin Stars.”International Journal of Contemporary Hospitality Management 34 (6): 2298–2319. https://doi.org/10.1108/IJCHM-08-2021-1025.
Gergaud, Olivier, Karl Storchmann, and Vincenzo Verardi. 2015. “Expert Opinion and ProductQuality: Evidence from NewYorkCityRestaurants.”Economic Inquiry 53 (2): 812–35. https://doi.org/10.1111/ecin.12178.
Johnson, Colin, Bernard Surlemont, Pascale Nicod, and Frederick Revaz. 2005. “Behind the Stars.”Cornell Hotel and Restaurant Administration Quarterly. Sage Publications. https://doi.org/10.1177/0010880405275115.
Sands, Daniel B. 2024. “Double-Edged Stars: Michelin Stars, Reactivity, and Restaurant Exits in NewYorkCity.”Strategic Management Journal n/a (n/a). https://doi.org/10.1002/smj.3651.